npj Precision Oncology — Latest Matching Preprints

1

Characterizing the Stability of Radiomics-Derived Tumor Habitats Using Image Perturbation in Head and Neck Cancer

Altinok, O.; Waqas, A.; Rasool, G.; Schabath, M. B.; Guvenis, A.

2026-06-02 radiology and imaging 10.64898/2026.05.30.26354532 medRxiv

Top 0.1%

10.5%

Tumor habitat imaging aims to capture intratumoral heterogeneity by grouping voxels with similar radiomic properties into spatially coherent subregions. However, radiomic features are known to be sensitive to small variations in image acquisition and processing, which can affect the stability of the resulting habitat maps. Feature repeatability is usually evaluated using test-retest scans, but such data are rarely available in clinical practice. To overcome this, we adopted an image perturbation framework, which simulates test-retest conditions by applying small, controlled changes to a single image. In head and neck cancer (HNC), where imaging is further complicated by complex anatomy, dental artifacts, and variability in tumor delineation, dedicated stability analyses are still missing. In this study, we evaluated how the repeatability of radiomic features affects habitat stability in 390 oropharyngeal cancer patients (discovery cohort). For each patient, 11 perturbed CT volumes were generated using small in-plane rotations, sub-voxel translations, and tumor-adaptive Gaussian noise. Ninety-three radiomic features were extracted from each image set, and their repeatability was assessed using the lower confidence limit of the intraclass correlation coefficient (ICC-LCL), grouped into poor, moderate, good, and excellent categories. Tumor habitats were then generated using K-means clustering (H = 3) for each feature subset, and habitat stability was measured by the Dice similarity coefficient (DSC) between habitat maps obtained from original and perturbed images. Overall, 48.4% of features were poorly repeatable and only 6.5% reached the excellent category, with first-order features being more stable than texture-based ones. 
Habitat stability followed a clear monotonic trend with feature repeatability: the median DSC was 0.93 for habitats generated from excellent features, 0.84 for good features, 0.75 for moderate features, and dropped to 0.41 for poorly repeatable features. Habitats generated using all features (without any repeatability-based filtering) yielded an intermediate median DSC of 0.52. All pairwise comparisons between feature subsets were statistically significant (p < 0.001). To evaluate the generalizability of these findings, the analysis was repeated in an independent external validation cohort of 372 oropharyngeal cancer patients treated at the H. Lee Moffitt Cancer Center. The stability classification showed substantial feature-level concordance between the discovery and validation cohorts (overall agreement 67.7%, quadratic-weighted Cohen's kappa = 0.78), with no feature shifting by more than two stability classes. The habitat-stability hierarchy was fully preserved in the validation cohort (median DSC of 0.87, 0.73, 0.69, and 0.39 for excellent, good, moderate, and poor features, respectively; all pairwise p < 0.001). These results show that selecting features with higher repeatability clearly improves the spatial consistency of habitat maps in HNC and support the use of perturbation-based stability analysis as a routine step in habitat imaging studies.
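As an illustration of the stability metric named in the abstract, the following is a minimal sketch of a per-habitat Dice similarity coefficient between two habitat label maps. It assumes the maps are flattened to equal-length lists of integer labels and that cluster labels have already been matched between the original and perturbed images (K-means labels are arbitrary). The function names and toy data are hypothetical and are not taken from the preprint.

```python
def dice(map_a, map_b, label):
    """Dice similarity coefficient for one habitat label across two flattened label maps."""
    if len(map_a) != len(map_b):
        raise ValueError("label maps must have the same number of voxels")
    in_a = sum(1 for v in map_a if v == label)
    in_b = sum(1 for v in map_b if v == label)
    both = sum(1 for a, b in zip(map_a, map_b) if a == label and b == label)
    if in_a + in_b == 0:
        # Label absent from both maps: treated as perfect agreement (a convention, not a rule).
        return 1.0
    return 2.0 * both / (in_a + in_b)


def mean_dice(map_a, map_b, labels):
    """Average the per-habitat Dice scores over the given labels."""
    return sum(dice(map_a, map_b, k) for k in labels) / len(labels)


# Toy example with three habitats (labels 0, 1, 2); one voxel changes habitat after perturbation.
original = [0, 0, 1, 1, 2, 2, 2, 1]
perturbed = [0, 0, 1, 2, 2, 2, 2, 1]

if __name__ == "__main__":
    for k in (0, 1, 2):
        print(k, round(dice(original, perturbed, k), 3))
    print("mean", round(mean_dice(original, perturbed, (0, 1, 2)), 3))
```

How the per-habitat scores are aggregated into a single DSC per patient is not specified in the abstract; the unweighted mean here is one possible choice.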

2

TumorArchetypeR: A modular framework to derive signature-based tumor subtypes

Luetge, M.; Nassiri, S.

2026-05-14 cancer biology 10.64898/2026.05.11.724259 medRxiv

Top 0.1%

10.5%

Motivation: The tumor microenvironment (TME) dictates cancer progression and therapeutic response, yet translating TME subtypes into robust clinical biomarkers remains a significant challenge. Existing classification models typically rely on static gene signatures and cohort-dependent normalization, making them ill-suited for application to the small, unbalanced datasets common in early-phase clinical trials. To better guide drug development, methods are required that offer the flexibility to target specific biological contexts and bridge the gap between the discovery of tumor archetypes and their robust translation to individual patient samples. Results: We developed TumorArchetypeR, a modular R package that unifies unsupervised subtype discovery with the generation of rank-based, single-sample classifiers. By leveraging a systematic parameter grid search, the framework identifies stable, data-driven subtypes rather than relying on arbitrary defaults. Crucially, to ensure clinical translatability, the package includes a module to train a robust classifier using binary gene-pair rules, enabling prediction without cohort-level preprocessing. Applying TumorArchetypeR to colorectal cancer, we resolved the heterogeneity of fibrotic tumors, distinguishing an immunosuppressive "Immune-enriched/Fibrotic" state from an immune-excluded "Fibrotic/Myeloid" phenotype. Furthermore, we identified a distinct "Th/B-cell enriched" archetype associated with superior survival, a group largely obscured by existing pan-cancer models. With our rank-based classifier demonstrating robust performance on previously unseen samples, these findings highlight TumorArchetypeR as a scalable, end-to-end solution for refining patient stratification and optimizing precision oncology strategies. The TumorArchetypeR package and documentation are openly available on GitHub at https://github.com/lutgem/TumorArchetypeR.
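To illustrate why binary gene-pair rules allow prediction without cohort-level preprocessing, here is a minimal sketch of a rank-based, single-sample vote. It is written in Python for illustration only and does not reflect the TumorArchetypeR API, which is an R package. The gene names, subtype names, and rule set are all hypothetical placeholders.

```python
from collections import Counter

# Hypothetical rules: each (gene_a, gene_b, subtype) casts one vote for
# `subtype` when gene_a is expressed more highly than gene_b in the sample.
RULES = [
    ("GENE_A", "GENE_B", "subtype_1"),
    ("GENE_C", "GENE_D", "subtype_1"),
    ("GENE_B", "GENE_A", "subtype_2"),
    ("GENE_D", "GENE_C", "subtype_2"),
]


def classify(sample, rules=RULES):
    """Return the subtype with the most gene-pair votes, or None if no rule fires."""
    votes = Counter()
    for gene_a, gene_b, subtype in rules:
        if sample[gene_a] > sample[gene_b]:
            votes[subtype] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


sample = {"GENE_A": 9.1, "GENE_B": 2.3, "GENE_C": 5.0, "GENE_D": 1.2}
# The same sample on a different scale: within-sample orderings are unchanged.
rescaled = {gene: value * 1000 for gene, value in sample.items()}

if __name__ == "__main__":
    print(classify(sample), classify(rescaled))
```

Because each rule only compares two genes within the same sample, any monotonic rescaling of that sample leaves the prediction unchanged, which is the property that removes the need for cohort-dependent normalization.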

3

dbGIST: An LLM-Assisted Multi-Omics Resource for Target Exploration and Cross-Dataset Validation in Gastrointestinal Stromal Tumors

Sun, Z.; Zhao, Q.; Li, J.-H.; Li, J.-J.; Liu, H.; Guo, Y.-X.; Tang, Y.-D.; Yang, F.; Liu, X.; Peng, S.-F.; Mi, W.-n.; Zhang, G.; Zhang, Z.; Yuan, M.-L.; Li, G.-H.; Wang, Y.-F.; Liu, C.; Li, S.-L.; Yang, J.-H.; Fu, Y.

2026-05-26 cancer biology 10.64898/2026.05.22.727292 medRxiv

Top 0.1%

9.9%

Gastrointestinal stromal tumors (GISTs) are the most common mesenchymal neoplasms of the gastrointestinal tract, yet GIST-specific omics evidence remains scattered across small cohorts and is not represented as a dedicated disease project in major cancer genomics resources, limiting reproducible target exploration. Here, we present dbGIST (https://www.dbgist.com), a dedicated GIST-focused multi-omics resource built to make dispersed GIST evidence searchable, analyzable, and reusable. dbGIST harmonizes data from 37 centers and 1,991 samples, including pathologically verified in-house cohorts, across genomics, bulk transcriptomics, proteomics, phosphoproteomics, and single-cell transcriptomics, and couples these data with curated clinical annotations covering survival, mutation status, risk stratification, metastasis or recurrence, mitotic index, tumor site and size, and imatinib response. The platform supports cohort-level molecular-clinical association, survival, enrichment, immune-infiltration, drug-sensitivity, and single-cell analyses through interactive visualizations, downloadable source data, and public APIs for programmatic access to reusable analysis outputs and visualization-ready data. An optional LLM-assisted interface helps users navigate analyses and interpret outputs. Using MCM7 as a case study, dbGIST linked a resource-derived candidate to survival, risk features, metastatic or recurrent disease, imatinib-response phenotypes, proliferative cell states, and in vitro GIST-cell behavior. dbGIST therefore provides a traceable and interoperable resource for target exploration and precision oncology research in GIST.

4

Genotype and methylation interact to reconfigure transcriptional regulation in colorectal cancer

Kim, B.; Kim, H.; Kwon, M.-K.; Hannenhalli, S.; Choi, S. S.

2026-05-30 bioinformatics 10.64898/2026.05.27.728350 medRxiv

Top 0.1%

8.5%

Background: Transcriptional regulation is shaped by both genomic variants and the environment. Yet, how the regulatory effects of genomic variants are reconfigured by dynamic epigenomic changes during tumorigenesis remains incompletely understood. Methods: We investigated methylation context-dependent links between genotype and gene expression in colorectal cancer (CRC) using paired tumor and normal-adjacent tissue (NAT) from 80 patients, thereby controlling for germline genomic background. By integrating promoter-targeted bisulfite sequencing with RNA-seq, we systematically compared expression quantitative trait loci (eQTLs) and methylation quantitative trait loci (mQTLs). To capture regulatory complexity beyond simple mediation, we implemented a memo-eQTL framework that explicitly models genotype x DNA methylation (GxM) interactions. Results: We observed extensive tissue specificity in both eQTL and mQTL landscapes; tumor-specific eGenes were significantly enriched for hallmark oncogenic pathways, including WNT and MAPK signaling. Standard mediation models explained only a minority of genotype-expression relationships, whereas our explicit interaction framework revealed widespread reconfiguration of methylation-dependent genetic effects in tumors. Memo-eQTL mapping (FDR < 0.05) identified 18 NAT and 73 tumor eGenes with significant GxM interactions, and results were consistent at a more permissive threshold (FDR < 0.2). We further developed a patient-level memo-eQTL score and found that interaction-based regulatory disruption in NAT, but not in tumor, significantly correlated with clinical stage (P = 0.035). Conclusions: Genetic regulation in cancer is reorganized through context-dependent GxM interactions. Importantly, GxM signatures in NAT are specifically linked to disease progression, offering new insights into field cancerization and the clinical consequences of regulatory reprogramming in CRC.
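For readers unfamiliar with interaction models, a generic linear form of a genotype-by-methylation interaction is sketched below. This is an assumed textbook formulation, not the memo-eQTL specification, which the abstract does not give. The symbols are illustrative: E_i is the expression of a gene in sample i, G_i the genotype dosage, M_i the promoter methylation level, and the beta terms are regression coefficients.

```latex
E_i = \beta_0 + \beta_G G_i + \beta_M M_i + \beta_{GM}\,(G_i \times M_i) + \varepsilon_i
```

Under this form, a nonzero interaction coefficient \beta_{GM} means the effect of genotype on expression, \beta_G + \beta_{GM} M_i, depends on the methylation level. A pure mediation model, in which genotype acts on expression only through methylation, cannot represent that dependence.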

5

Addressing the Global Diagnostics Gap for Childhood Leukemias: A Global, Multisite Type 2 Hybrid Validation Study of Nanopore-based Adaptive Sampling Whole Genome Sequencing

Alexander, T. B.; Islam, R.; Aijaz, J.; Achterberg, T.; Bolous, N.; Cammel, K.; de Ridder, J.; Geyer, J.; Gray, S.; Groenewegen, N.; Hussain, S.; Imran, S.; Jamal, S.; Kar, S.; Kanavy, D.; Mansoor, N.; Parihar, M.; Saha, V.; Tops, B.; van Tuil, M.; Wilkins, D.; Weck, K.; Wu, G.; Zhou, L.; Kester, L.; Wang, J. R.; Bhakta, N.

2026-05-21 hematology 10.64898/2026.05.19.26353434 medRxiv

Top 0.1%

8.4%

Background: Modern therapy for childhood and adolescent leukemia requires accurate risk classification of genomic subtype. Although short-read next-generation sequencing (NGS)-based approaches provide comprehensive clinical diagnostics in limited, highly resourced settings, they remain expensive, slow, and inaccessible to most children worldwide. Transformative approaches are needed to improve diagnostic classification for leukemia globally. Methods: We simultaneously continued to develop an analytical pipeline, NASVar (Nanopore variant calling for adaptive sampling), and conducted a multicenter, type 2 hybrid clinical validation study of an Oxford Nanopore Technologies (ONT) adaptive-sampling whole-genome sequencing (asWGS) assay across hospitals with varying diagnostic resources. In preparation for implementation, a global panel developed a leukemia-based standardized gene set and consensus laboratory-developed test (LDT) validation guidelines. Measures of assay effectiveness compared to both conventional and orthogonal NGS methods, where available, were simultaneously collected with data to measure the implementation outcomes of feasibility, fidelity, appropriateness, and cost. Results: All four centers successfully completed the LDT validation, with minimal adaptations required for regulatory compliance. A total of 457 specimens were sequenced (331 B-ALL, 83 AML, 43 T-ALL). For the 210 B-ALL cases with locally resolved genomic subtypes defined by DNA alterations, asWGS was 100% concordant (210/210). Cases locally defined as B-other were resolved via asWGS with disease-defining DNA alterations in 47% (49/105) of cases. An additional 41% (43/105) of locally defined B-other cases were classified by incorporation of DNA methylation, and all 16 B-ALL patient-derived xenograft controls were correct, for a total of 96% (318/331) of all B-ALL cases in the cohort resolved with single-assay asWGS.
For AML, 97% (56/58) of cases with locally resolved genomic subtypes were identified by automated asWGS analysis, while an additional two cases were identified after targeted manual review. At Indus Hospital in Pakistan, the B-ALL and AML diagnostic genomic subtype yield increased from 28% with local standard of care diagnostic testing, to 84% with asWGS. The cost of reagents and consumables in the United States, assuming pooled three-plexing, was $343/sample. Based on the combined hybrid validation results, all centers are independently preparing for clinical return of results. Conclusions: ONT asWGS was successfully validated as a clinical assay in four diverse hospital settings. As a single, multi-omic platform that delivers value across the continuum of high-resource to resource-limited contexts, the approach offers a disruptive solution to address the global equity gap in cancer diagnostics.
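The B-ALL figures reported in this abstract can be checked with a few lines of arithmetic. The sketch below uses only counts stated in the abstract; the variable names are illustrative.

```python
# Counts as reported in the abstract.
locally_resolved = 210      # B-ALL cases with locally resolved subtypes, all concordant
b_other_total = 105         # cases locally defined as B-other
b_other_by_dna = 49         # B-other cases resolved by DNA alterations
b_other_by_methylation = 43 # B-other cases resolved by adding DNA methylation
pdx_controls = 16           # patient-derived xenograft controls, all correct
total_b_all = 331

resolved = locally_resolved + b_other_by_dna + b_other_by_methylation + pdx_controls


def pct(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


if __name__ == "__main__":
    print(resolved, total_b_all, pct(resolved, total_b_all))
    print(pct(b_other_by_dna, b_other_total), pct(b_other_by_methylation, b_other_total))
```

The three B-ALL groups (210 locally resolved, 105 B-other, 16 xenograft controls) sum to the 331 B-ALL specimens, and the resolved total matches the reported 318/331.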

6

Basal gland localization and focal distribution of OLFM4-expressing cells in increasing severity of gastric intestinal metaplasia

Sathe, A.; Meka, R.; Geier, B.; Long, R.; Wong, C.; Han, S.; Shen, J.; Amieva, M. R.; Ji, H. P.; Huang, R. J.

2026-05-20 cancer biology 10.64898/2026.05.14.725297 medRxiv

Top 0.1%

6.5%

Patients with gastric intestinal metaplasia (GIM), a precancerous lesion, are at high risk for progressing to gastric cancer. Identifying these patients is critical to enable gastric cancer interception. Current approaches rely primarily on histologic evaluation of GIM severity and extent, which may be improved by incorporating molecular features that distinguish high-risk lesions. Our prior single-cell and spatial transcriptomics study identified differentially expressed genes associated with the highest-risk category of GIM. They included ANPEP expressed in enterocytes and CPS1 and OLFM4 expressed in intestinal stem-like or progenitor cells. We evaluated the protein expression and localization of these three markers to understand the cellular features associated with GIM risk and their spatial distribution within metaplastic tissues. Using multiplex immunofluorescence, whole slide image analysis and confocal microscopy, we examined protein expression from 100 tissue biopsies annotated for metaplasia severity using the Operative Link on Gastric Intestinal Metaplasia Assessment (OLGIM) system. Tissue samples included control gastric tissue, GIM, dysplasia and adenocarcinoma. Quantitative whole slide image analysis demonstrated that CPS1 expression had a modest association with disease severity. Although ANPEP was strongly associated with GIM severity, it was also frequently expressed in stromal regions outside epithelial glands. In contrast, OLFM4 expression was largely restricted to epithelial glands and showed a strong association with increased OLGIM severity. These OLFM4-positive epithelial cells were present in discrete glandular foci that expanded with increasing severity of metaplasia. Within individual metaplastic glands, OLFM4 expression was highest at the gland base with decreased expression toward the gland surface. Overall, these findings identified OLFM4 as a protein marker associated with high-risk GIM. 
The spatial organization of OLFM4-expressing cells at the base of metaplastic glands and their focal expansion within tissues suggest the presence of a stem cell-like epithelial compartment that may contribute to the progression of GIM towards gastric cancer.

7

Non-Genetic Mechanisms of Fractional Resistance to Abemaciclib in Dedifferentiated Liposarcoma.

Bailey, L. E.; Wolff, S. C.; Zikry, T.; Sessions, G. A.; Whitman, A. A.; Titerina, E. K.; Raish, H.; Beane, J.; Purvis, J. E.; Spanheimer, P. M.

2026-05-26 cancer biology 10.64898/2026.05.22.727236 medRxiv

Top 0.1%

6.5%

Dedifferentiated liposarcoma is a rare mesenchymal malignancy driven by amplification of chromosome 12q13-15, which includes the oncogenes CDK4 and MDM2. CDK4 amplification provides a rationale for targeted therapy with CDK4/6 inhibitors, and abemaciclib has shown the most durable activity reported to date in this disease. Clinical responses, however, are incomplete and often transient, and the cellular features that allow tumor cells to continue proliferating during treatment are not well understood. To address this gap, we performed multiplexed single-cell imaging to quantify 17 cell-cycle regulators in both the dedifferentiated liposarcoma cell line Lipo246 and surgically resected primary human cells exposed to abemaciclib. Both models contained a subpopulation of cells that retained phosphorylated retinoblastoma protein, a marker of cell proliferation, at the highest abemaciclib doses. These fractionally resistant cells were defined by selective enrichment of cyclin-dependent kinase 2 (CDK2), cyclin B1, and phosphorylated ribosomal protein S6 (pS6), and showed enhanced sensitivity to the CDK2 inhibitor tagtociclib. Together, these findings reveal nongenetic cell-cycle plasticity as a mechanism of escape from CDK4/6 inhibition in dedifferentiated liposarcoma and nominate CDK2 and the PI3K-mTOR pathway as candidate targets for combination therapy.

8

Determination of the practical utility of ESMO Scale for Clinical Actionability of molecular Targets (ESCAT): mapping OncoKB level 1 alterations using ESCAT

Kordes, M.; Chakravarty, D.; Boberg, E.; Creignou, M.; de Petris, L.; Karlsson, C.; Burstrom, L. L.; Suehnholz, S.; Yachnin, J.; Wiklander, O. P.; Haglund de Flon, F.

2026-05-20 oncology 10.64898/2026.05.16.26353390 medRxiv

Top 0.1%

6.4%

Background. The European Society for Medical Oncology (ESMO) Scale for Clinical Actionability of molecular Targets (ESCAT) ranks genomic alterations by the evidence supporting the predictive value of the molecular target for response to targeted therapies. No openly available, systematically curated set of standard care biomarkers mapped to the ESCAT framework exists to support clinical decision-making or harmonize biomarker interpretation. Methods. We mapped all OncoKB™ Level 1 biomarkers to ESCAT tiers using evidence cited by OncoKB™, excluding abstract-only data. Eight board-certified oncologists and hematologists independently assigned ESCAT tiers, with discrepancies resolved through structured consensus meetings. Recurring evidence scenarios that did not correspond to any existing ESCAT tier informed a set of a priori defined modifications, which were subsequently applied to biomarkers that could not be classified using native ESCAT criteria. Results. Of 188 OncoKB™ Level 1 biomarkers, 16 were excluded due to abstract-only evidence. Using native ESCAT criteria, 51% of the remaining biomarkers were classified as Tier 1, 3% Tier 2, 18% Tier 3, 6% Tier X and 22% could not be assigned to any tier. Applying the modified ESCAT criteria resolved all previously unclassifiable biomarkers and increased Tier 1 assignments to 73%. Inter-rater reliability (Krippendorff's alpha) was moderate (0.586) and 62% of classifications required consensus discussions. Comparison with ESCAT tiers reported in ESMO Clinical Practice Guidelines showed improved concordance when using the modified criteria. Conclusions. The native ESCAT criteria are highly stringent, resulting in many FDA-recognized, clinically validated biomarkers that are currently assigned Level 1 by OncoKB™ not mapping to any existing tier. Our predefined modifications improved alignment with OncoKB™ Level 1 designations and with published ESMO Clinical Practice Guidelines.
The mapped set of standard care biomarkers is provided on the OncoKB™ website, offering a practical resource that harmonizes ESCAT tiers of evidence with a widely adopted levels-of-evidence schema.

9

Equitable Health Intelligence: An Open Benchmark of Multi-Population Machine Learning for Omics-Based Cancer Prognosis

Sharma, T.; Chopra, A. P.; Agrawal, L.; Verma, N. K.; Starlard-Davenport, A.; Wang, J.; Hayes, D. N.; Cui, Y.

2026-06-02 bioinformatics 10.64898/2026.05.29.728755 medRxiv

Top 0.1%

6.4%

Purpose: Machine learning (ML) models for omics-based cancer prognosis are often trained on data from predominantly European-ancestry populations, producing biased predictions for other populations and undermining equitable genomic medicine. Existing fairness benchmarks mainly focus on outcome parity rather than predictive performance parity across populations. Public benchmark resources are needed for systematically detecting and mitigating such performance disparities in multi-population cancer prognosis. Methods: We developed Equitable Health Intelligence (EHI, https://ehiportal.org), an open-source benchmark of multi-population ML for omics-based cancer prognosis. EHI contains 1,475 ML tasks across 40 cancer/pan-cancer types, 4 omics feature sets, 4 clinical endpoints, 5 event-time thresholds, and 3 data-disadvantaged population (DDP) groups relative to a majority European-ancestry population group. Deep neural network models are trained under three multi-population ML schemes (Mixture, Independent, and Transfer Learning), with Naive Transfer included as a no-adaptation control, comprising a total of 10,325 ML experiments. Results: The EHI platform provides an interactive environment with visualization and exploratory tools for users to inspect predictive performance disparities between the majority European-ancestry group and data-disadvantaged populations, evaluate the extent to which transfer learning mitigates these disparities, and examine the impact of feature engineering methods across cancer types, omics features, and clinical endpoints. Conclusion: EHI is an open, interactive, and extensible benchmark for identifying and addressing performance disparities in multi-population ML for omics-based cancer prognosis. It provides a foundation for a growing ecosystem of methods targeting ML performance disparities arising from biomedical data inequality and population-level distribution shifts, thereby advancing equitable AI in precision oncology.

10

Prevalence and Clinical Significance of Adult-Onset Cancer Predisposition Variants in Pediatric Oncology

Maciaszek, J. L.; Pastor Loyola, V.; Cain, T.; Cardenas, M.; Blackburn, P. R.; Wilkinson, M. R.; Koo, S. C.; Wu, C.-H.; Li, C.; Wang, L.; Nichols, K. E.; Klco, J. M.; Eldomery, M. K.

2026-06-08 genetic and genomic medicine 10.64898/2026.06.07.26354365 medRxiv

Top 0.1%

4.4%

Purpose: Pathogenic or likely pathogenic (P/LP) variants are increasingly identified in genes more commonly associated with adult-onset cancer predisposition, but their prevalence and relevance to children who present with cancer remain unclear. Methods: We retrospectively analyzed 1,280 consecutive pediatric patients with cancer who underwent clinical germline sequencing, using a virtual panel, from 2021 to 2024. Genes with P/LP variants were categorized as adult-onset cancer predisposition genes (aoCPG) or pediatric-onset cancer predisposition genes (poCPG) according to cancer risk before age 18 years and pediatric surveillance recommendations. Variant relevance was adjudicated using tumor diagnosis/histopathology, immunohistochemistry, and tumor molecular features and classified as primary, secondary, or indeterminate. Results: Among 1,280 patients, 197 (15.4%) harbored 211 P/LP variants across 54 genes. Sixty-six variants (31.3%) occurred in aoCPG, 87 (41.2%) in poCPG, and 58 (27.5%) were heterozygous variants in autosomal recessive genes. Among adult-onset variants, 7 (10.6%) were primary, 54 (81.8%) secondary, and 5 (7.6%) indeterminate. Among pediatric-onset variants, 77 (88.5%) were primary and 10 (11.5%) secondary. Six patients (3 adult-onset variants; 3 pediatric-onset variants) received targeted therapy informed by germline/somatic sequencing results. Conclusion: In pediatric oncology, most variants in aoCPG are secondary rather than tumor-related findings. Tumor-informed interpretation, beyond variant classification, may improve reporting, counseling, and therapeutic decision-making.

11

Deep Learning Spatial Profiling of CD103+CD8+ T Cells and Survival in Rectal Cancer After Neoadjuvant Chemoradiotherapy

Abe, T.; Yamashita, K.; Nagasaka, T.; Fujita, M.; Ueda, Y.; Miyake, S.; Ito, R.; Adachi, Y.; Ando, M.; Tsuneki, T.; Okazoe, Y.; Konaka, R.; Takahashi, T.; Kagiyama, H.; Tachibana, T.; Imai, M.; Yoshida, T.; Saito, M.; Mukohyama, J.; Kanayama, K.; Koma, Y.-I.; Otowa, Y.; Hasegawa, H.; Ikeda, T.; Koterazawa, Y.; Aoki, T.; Harada, H.; Urakawa, N.; Goto, H.; Kanaji, S.; Yanagimoto, H.; Matsuda, T.; Takamura, S.; Yamashita, T.; Sasaki, R.; Fukumoto, T.; Kakeji, Y.

2026-05-28 oncology 10.64898/2026.05.26.26353629 medRxiv

Top 0.1%

4.2%

Background: CD8+ tumor-infiltrating lymphocytes (TILs) are established prognostic markers in colorectal cancer, yet the clinical significance of CD103+CD8+ tissue-resident memory-like (TRM-like) T cells in locally advanced rectal cancer (LARC) after neoadjuvant chemoradiotherapy (NACRT) remains unknown. Methods: We quantified CD8+ and CD103+CD8+ T-cell densities in stromal and intratumoral compartments of post-NACRT resection specimens from 40 LARC patients using Cu-Cyto, a deep learning-based imaging cytometry platform. Associations with survival, pathological response, and adjuvant chemotherapy (AC) were examined. Treatment-induced T-cell dynamics were assessed in paired pretreatment biopsies and post-NACRT resections (n = 9). Results: High stromal CD103+CD8+ density independently predicted better 5-year RFS (67.4% vs. 12.1%, p < 0.001) and OS (80.0% vs. 26.6%, p = 0.016); intratumoral density showed no prognostic significance. Pathological response correlated with stromal CD8+ but not CD103+CD8+ density. Paired analysis revealed a selective non-expansion of the CD103+ subset: stromal CD8+ T cells increased significantly after NACRT while CD103+CD8+ density remained unchanged. AC may preferentially benefit patients with low stromal CD103+CD8+ density. Conclusions: Stromal CD103+CD8+ T-cell density is a robust independent prognostic biomarker in rectal cancer after NACRT that appears to reflect pre-existing rather than treatment-induced immunity. Given its stability across NACRT, pretreatment biopsy assessment may provide equivalent prognostic information, with potential implications for patient stratification before treatment initiation.

12

An 8 Gene Bevacizumab Resistance Signature Predicts Prognosis and Reveals Immunosuppressive Microenvironment in Colorectal Cancer

Niu, Z.; Qiu, D.; Xu, P.

2026-05-20 bioinformatics 10.64898/2026.05.17.725749 medRxiv

Top 0.1%

4.1%

Background: Bevacizumab resistance severely limits long-term efficacy in metastatic colorectal cancer (CRC). This study aimed to develop and validate a bevacizumab resistance-associated gene signature for prognosis prediction and immune microenvironment characterization in CRC. Methods: Two GEO datasets (GSE19862, GSE86582) with bevacizumab response data and TCGA-COAD/READ RNA-seq data were analyzed. Overlapping differentially expressed genes (DEGs) linked to both CRC progression and bevacizumab resistance were identified. An 8-gene signature (AXIN2, PSORS1C1, KRT74, SLC2A3, STIL, IL33, GALNT6, HSD11B2) was constructed via univariate Cox and LASSO-Cox regression. Results: In the TCGA cohort, high-risk patients had shorter overall survival (OS; log-rank P < 0.0001). Time-dependent ROC yielded 1-year AUC = 0.638, 3-year AUC = 0.657, and 5-year AUC = 0.757. Multivariate Cox regression confirmed the risk score as an independent prognostic factor. External validation in GSE39582 (optimal cutoff = -1.49) replicated these findings: high-risk patients had inferior OS (P = 0.0016) with acceptable 1/3/5-year AUCs and retained independent prognostic value (HR = 1.634, P = 0.00415). CIBERSORT and ESTIMATE analyses showed that the high-risk group was characterized by increased M2 macrophages and neutrophils, higher immune and stromal scores, and reduced activated memory CD4+ T cells, monocytes, and activated dendritic cells (all P < 0.05). GSEA highlighted enrichment of TNF-α/NF-κB, IL-6/JAK/STAT3, and immune checkpoint pathways in the high-risk group. AXIN2 (HR = 0.829, P = 0.032) was an independent protective factor, while PSORS1C1 (HR = 1.356, P = 0.048) was an independent risk factor. Conclusion: The 8-gene bevacizumab resistance signature robustly predicts prognosis and reflects an immunosuppressive microenvironment closely linked to bevacizumab failure in CRC. These findings provide novel insights into immune-mediated resistance and support clinical risk stratification.

13

Quantifying Cancer Clinical Trial Eligibility Using Artificial Intelligence-Based Matching

Goel, K. P.; Myall, N. J.; Dickerson, J.; Caswell-Jin, J. L.; Johnson, T.; Worth, J. E.; Gensheimer, M. F.

2026-06-05 oncology 10.64898/2026.06.03.26354859 medRxiv

Top 0.1%

4.0%

PURPOSE: To develop and validate an artificial intelligence-enabled platform that converts unstructured cancer trial eligibility criteria into structured queries and quantifies trial eligibility across advanced/metastatic cancer trials. METHODS: We downloaded actively recruiting US interventional treatment trials for advanced/metastatic breast cancer, colon cancer, and non-small cell lung cancer from ClinicalTrials.gov. Medical oncologists created 24 synthetic patient vignettes. A large language model converted trial eligibility criteria into Structured Query Language (SQL) code and patient information into structured records, enabling automated matching. Cancer details and treatment history were considered, but not laboratory results or comorbidities. Validation included physician editing of generated eligibility code for 30 trials, and blinded physician eligibility assessment for five trials. We then evaluated how age, ECOG performance status, sex, and ZIP code affected the number of eligible trials. RESULTS: Of 833 candidate trials, 746 met inclusion criteria. In physician review of 30 trials, edits to generated SQL did not change any of 720 trial-patient eligibility determinations for 24 synthetic patients. In blinded validation across 120 trial-patient pairs, automated matching achieved 97% accuracy. Across synthetic patients, eligible trials ranged from 31 to 258 when there were no geographic restrictions. Eligibility decreased markedly with worse performance status and with geographic restriction (both p<0.001). Later-phase, randomized, and molecularly selective trials had fewer eligible patients. CONCLUSION: AI-based structuring of trial eligibility criteria can support accurate, scalable measurement of potential cancer trial eligibility. In this demonstration, performance status, geography, and age were major determinants of eligibility across the active metastatic trial landscape.
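The matching step described in this abstract, in which eligibility criteria are expressed as SQL and evaluated against structured patient records, can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module. The table schema, trial identifiers, and WHERE clauses are hypothetical and hand-written; they stand in for the LLM-generated code described in the preprint and are not taken from it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE patients (
           patient_id TEXT PRIMARY KEY,
           cancer_type TEXT,
           age INTEGER,
           ecog INTEGER,
           prior_lines INTEGER
       )"""
)
# Hypothetical structured records standing in for synthetic patient vignettes.
conn.executemany(
    "INSERT INTO patients VALUES (?, ?, ?, ?, ?)",
    [
        ("P1", "NSCLC", 64, 1, 1),
        ("P2", "NSCLC", 71, 3, 2),
        ("P3", "breast", 52, 0, 0),
    ],
)

# Hypothetical eligibility criteria, one SQL predicate per trial.
TRIAL_CRITERIA = {
    "TRIAL_X": "cancer_type = 'NSCLC' AND age >= 18 AND ecog <= 1",
    "TRIAL_Y": "cancer_type = 'breast' AND ecog <= 2 AND prior_lines <= 1",
    "TRIAL_Z": "age >= 18 AND ecog <= 2",
}


def eligible_trials(patient_id):
    """Return the trials whose SQL predicate holds for the given patient."""
    matches = []
    for trial_id, where_clause in TRIAL_CRITERIA.items():
        # The predicate is trusted text in this sketch; the patient ID is bound as a parameter.
        row = conn.execute(
            f"SELECT 1 FROM patients WHERE patient_id = ? AND ({where_clause})",
            (patient_id,),
        ).fetchone()
        if row is not None:
            matches.append(trial_id)
    return matches


if __name__ == "__main__":
    for pid in ("P1", "P2", "P3"):
        print(pid, eligible_trials(pid))
```

Counting eligible trials per patient then reduces to the length of the returned list. In this toy data, the patient with ECOG 3 matches no trial, which mirrors the abstract's finding that eligibility falls with worse performance status.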

14

MyeGPT: an AI agent for Multiple Myeloma

Chang, J. G.; Gout, A. M.; Rodiger, J.; Chung, T.-H.; Mulligan, G.; Chng, W. J.

2026-05-20 hematology 10.64898/2026.05.14.26353252 medRxiv

Top 0.1%

4.0%

Today, advancements in our understanding of cancer biology are increasingly attributed to large-scale clinical-molecular datasets. The case in point for multiple myeloma, the second-most prevalent haematological malignancy, is the CoMMpass study, a dataset with the paired clinical and sequencing data of 1,143 patients. Given its complexity, the multi-omics data of CoMMpass demands programming skills, which imposes a hurdle for experimental myeloma researchers who want to validate their hypotheses on population data. The rise of agentic AI over the past few years presents unparalleled opportunities to bridge this technical gap. We propose MyeGPT (Myeloma Generative Pretrained Transformer), an AI bioinformatician for multiple myeloma that relies on the CoMMpass dataset as its ground truth. MyeGPT converts natural language queries such as 'What are the characteristics of patients who relapse after induction therapy' or 'Compare the overall survival of high vs normal NSD2 expression' into de novo analyses backed by real data, then proactively generates plots to visualize the results. We developed a set of evaluation questions based on CoMMpass, complete with scoring criteria, and ran benchmarks to identify the best choice of LLM and text-embedding model. We package MyeGPT as a ready-to-use browser application, enabling CoMMpass-grounded hypothesis validation from a smartphone.

15

A Bioprinted Head and Neck Cancer Organoid-Based Platform for Evaluating Multimodal Therapies

Lin, L.; Bommakanti, K. K.; Wooten, C.; Gonzalez, A. E.; Alhiyari, Y.; Levi, J.; Wang, B.; Sannajust, A.; Evans, L. K.; Tebon, P.; St. John, M. A.; Soragni, A.

2026-05-21 cancer biology 10.64898/2026.05.20.726741 medRxiv

Top 0.2%

3.6%

Treatment of advanced head and neck squamous cell carcinoma (HNSCC) often involves radiotherapy combined with chemotherapy, targeted therapy, or immunotherapy. However, due to its anatomical and molecular heterogeneity, identifying the most effective treatment for each patient remains a major clinical challenge. To address this need, we developed a high-throughput organoid-based drug screening platform that uses patient-derived organoids to assess candidate treatment regimens. We validated the platform by establishing bioprinted 3D organoids of human HNSCC cell lines and exposing them to X-ray radiation in combination with various small-molecule drugs and biologics. We quantified viability using ATP release assays and assessed extracellular matrix (ECM) invasion with a machine learning-based brightfield image analysis pipeline. Proof-of-concept experiments with HPV-negative HNSCC lines (HN30 and HN31, established from primary and metastatic disease from the same patient) and HPV-positive HNSCC cells (SCC154) revealed different therapy agents that can radiosensitize each cell line. Image analysis showed that copanlisib, afatinib, and ibrutinib could limit ECM invasion of HN31, while the AKT inhibitor ipatasertib promotes invasion of HN30 cells, consistent with previous studies. Application of the platform to patient-derived HPV+ oropharyngeal tumor organoids showed that they shared sensitivity to several agents while also exhibiting differences against certain therapies. Cetuximab, sorafenib, and nedisertib significantly radiosensitized organoids from two clinical samples. This work demonstrates the feasibility of performing sensitivity screening by integrating bioprinting, conventional viability assays, and advanced image analysis techniques. 
This platform has the potential to enable a personalized therapeutic pipeline for patients with advanced HNSCC, optimizing responses to radiotherapy and targeted agents to improve clinical outcomes while avoiding modulators that may promote tumor invasion.

16

Integrated T-Cell Receptor Repertoire and Tumor Immunogenicity Profiling Reveals Distinct Immunogenomic States in Endometrial Cancer

Aversa, I.; Abatino, A.; Isabello, A.; Gallo, R.; Isdraele, L.; Straface, T.; Zullo, F. M.; Guida, M.; Saccone, G.; Fiume, G.; Venturella, R.; Viglietto, G.; Cuda, G.; Costanzo, F.; Zullo, F.; Palmieri, C.

2026-06-10 oncology 10.64898/2026.06.08.26355191 medRxiv

Top 0.2%

3.6%

Background: Endometrial cancer exhibits marked molecular and immune heterogeneity that is only partially explained by established genomic biomarkers. We investigated whether T cell receptor (TCR) repertoire architecture captures complementary dimensions of antitumor immunity beyond conventional molecular classification. Methods: Paired tumor and peripheral blood samples from eight patients with molecularly characterized endometrial cancer underwent TCR repertoire profiling. Diversity, clonality, and tumor-blood overlap metrics were integrated with genomic variables, including tumor mutational burden (TMB), genomic instability metric (GIM), and POLE status. Principal component analysis and correlation analyses were used to identify major dimensions of repertoire organization. Composite Immune Focusing and Immune Sharing Scores were derived to summarize dominant repertoire patterns. Results: The first two principal components explained 70.1% of total repertoire variance and revealed substantial heterogeneity independent of histological subtype. TMB was strongly associated with reduced repertoire diversity and increased clonal dominance, resulting in a robust association with the Immune Focusing Score (ρ = 0.88, p = 0.004). POLE-mutated tumors occupied the extreme end of this focusing continuum. In contrast, genomic instability was associated with increased tumor-blood repertoire overlap and preserved diversity, reflected by a strong correlation between GIM and the Immune Sharing Score (ρ = 0.76, p = 0.027). The two immune scores showed minimal correlation with each other (ρ = -0.24, p = 0.57), indicating that they capture largely independent aspects of immune organization. Conclusion: Integrative analysis of TCR repertoire architecture and tumor genomics identifies distinct immunogenomic states in endometrial cancer that are not fully captured by conventional molecular classification. 
If validated in larger cohorts, immune focusing and immune sharing metrics may provide complementary biomarkers for patient stratification and immunotherapy-oriented precision oncology.
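The abstract does not state which diversity, clonality, and overlap metrics were used. As a rough sketch under that caveat, the following computes three commonly used stand-ins: Shannon entropy, clonality defined as one minus normalized entropy, and Jaccard overlap of clonotype sets. The function names and the toy clone counts are illustrative assumptions, not data or methods from the study.

```python
import math


def shannon_entropy(clone_counts):
    """Shannon entropy (natural log) of a clonotype -> read count mapping."""
    total = sum(clone_counts.values())
    return -sum(
        (c / total) * math.log(c / total)
        for c in clone_counts.values()
        if c > 0
    )


def clonality(clone_counts):
    """1 - normalized entropy: 0 for a perfectly even repertoire, 1 for a single clone."""
    n_clones = sum(1 for c in clone_counts.values() if c > 0)
    if n_clones == 0:
        raise ValueError("repertoire has no clones")
    if n_clones == 1:
        return 1.0
    return 1.0 - shannon_entropy(clone_counts) / math.log(n_clones)


def jaccard_overlap(repertoire_a, repertoire_b):
    """Fraction of distinct clonotypes shared between two repertoires."""
    set_a, set_b = set(repertoire_a), set(repertoire_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Toy repertoires (made-up CDR3 labels and counts, for illustration only).
tumor = {"clone_1": 90, "clone_2": 5, "clone_3": 5}
blood = {"clone_1": 10, "clone_2": 10, "clone_4": 10, "clone_5": 10}

print("tumor clonality:", round(clonality(tumor), 3))
print("blood clonality:", round(clonality(blood), 3))
print("tumor-blood overlap:", jaccard_overlap(tumor, blood))
```

In this toy case the tumor repertoire is dominated by one clone and so scores higher on clonality than the even blood repertoire, which is the direction of effect the abstract associates with immune focusing.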

17

Novel COX-2 Targeted Nanobodies for Molecular Endoscopic Imaging of Colorectal Adenomas

Uddin, M. J.; Xu, S.; Goodman, M. C.; Aleem, A. M.; Niitsu, H.; Rose, K. L.; Crews, B. C.; Banerjee, S.; DeJulius, C. R.; Hoogenboezem, E. N.; Kingsley, P. J.; Reyzer, M. L.; Klendworth, J.; Milad, M.; Lin, S.; Wadzinski, B.; Spiller, B. W.; Duvall, C. L.; Coffey, R. J.; Marnett, L. J.

2026-05-19 bioengineering 10.64898/2026.05.16.724741 medRxiv

Top 0.2%

3.2%

Colorectal cancer (CRC) is one of the leading causes of cancer-related mortality in men and women. Timely detection and diagnosis are key to management of CRC, which is under-diagnosed because colorectal aberrant crypt foci, hyperplastic polyps, and microadenomas are often missed with conventional colonoscopy. The enzyme cyclooxygenase-2 (COX-2) is overexpressed in early stages of colorectal carcinogenesis and plays an important regulatory role in the process, suggesting that it could be a valuable target for enhanced imaging of nascent disease. Thus, we have generated an alpaca-derived library of 73 COX-2-specific nanobody clones. Here, we describe one such nanobody, F9-K45Q-K77Q-ROX, in which two native lysine residues have been mutated, followed by conjugation to a fluorophore at the N-terminus with retention of COX-2-selective binding. The site of fluorophore conjugation and COX-2 binding affinity of F9-K45Q-K77Q-ROX were determined by proteomic and microscale thermophoretic analyses, respectively. In cell culture studies using 1483 human head and neck squamous cell carcinoma cells, F9-K45Q-K77Q-ROX accumulated inside cells and bound to intracellular COX-2, as visualized by fluorescence microscopy. In vivo pharmacokinetic and toxicological analyses revealed that F9-K45Q-K77Q-ROX is detectable in circulation with a plasma half-life of 17.9 min and that there is no short-term toxicity associated with single injections of 10 mg/kg, 20 mg/kg, or 40 mg/kg doses at 24 h post-administration. Noninvasive in vivo fluorescence endoscopic imaging validated tumor-specific accumulation of F9-K45Q-K77Q-ROX in azoxymethane/dextran sodium sulfate-induced colorectal adenomas in mice. This work demonstrates the first COX-2-targeted nanobodies, including a fluorescent derivative that offers significant promise for targeted endoscopic imaging of COX-2-expressing neoplasms. 
Significance Statement: Current colorectal cancer screening procedures, such as white-light colonoscopy, chromoendoscopy, and narrow-band imaging aim to detect solid colon tumors and precursor lesions. However, these methods tend to detect only raised solid tumors and mature cancers, whereas precursor lesions, such as aberrant crypt foci, hyperplastic polyps, and small adenomas are frequently missed. To address the need for better visualization of early lesions, we developed a library of alpaca-derived nanobodies targeted to cyclooxygenase-2 (COX-2), an enzyme that is overexpressed in colorectal adenomas. COX-2-targeted nanobodies bearing a fluorescent tag accumulate and are retained in colonic adenomas, facilitating their endoscopic visualization. This novel COX-2-targeted nanobody platform may also be valuable for early detection of other neoplastic diseases in which COX-2 overexpression occurs.

18

Ascites-Derived Organoids for Prediction of Treatment Response and Clinical Management in Ovarian Cancer

Arias-Diaz, A. E.; Fernandez Diaz, N.; Perez-Beliz, E.; Otero-Alen, M.; Vilar, A.; Diaz, E.; Moreno-Bueno, G.; Dominguez-Medina, E.; Bernardez, B.; Lopez-Lopez, R.; Curiel, T.; Abal, M.

2026-05-20 oncology 10.64898/2026.05.13.26352440 medRxiv

Top 0.2%

3.1%

High-grade serous ovarian cancer (HGSOC) patients initially respond to platinum-based chemotherapy, but usually relapse within two years and ultimately develop therapy resistance. Management of response and effective clinical decisions are currently based on nonspecific biomarkers and limited imaging techniques, illustrating the clear clinical need for reliable predictors of response. In this work, we evaluated how well patient-derived organoids, generated from ascitic fluid and functionally tested in parallel with the patient's clinical course, predict treatment response and guide clinical decision-making in a patient-specific manner. Ascites-derived organoids reliably recapitulated the histological and molecular features of a paradigmatic HGSOC patient with an apparent dissociated response, and demonstrated chemoresistance months before laparoscopy confirmed persistent inoperable disease with poor pathological response. Drug screening identified alternative therapeutic options, while multi-omics provided additional insights into the tumor-specific biological features, to assist in the personalized clinical management of ovarian cancer.

19

Formalising Limits of Circulating Tumour DNA Detection: A Signal Detection Framework for Clinical Threshold Specification

Walinjkar, A.

2026-06-10 oncology 10.64898/2026.06.08.26355204 medRxiv

Top 0.2%

3.0%

Background: Circulating tumour DNA (ctDNA) liquid biopsy is now established across oncology for early cancer detection, minimal residual disease surveillance, and treatment monitoring. Detection thresholds for all current ctDNA assays are derived empirically through receiver operating characteristic analysis on training cohorts - a statistically valid but theoretically uninformed approach that does not specify the minimum detectable tumour fraction given assay technical characteristics, nor identify when increasing sequencing depth ceases to provide additional clinical information. Methods: We model ctDNA detection as a binary hypothesis testing problem with Binomial-distributed mutant allele counts against a sequencing error noise floor. The Neyman-Pearson lemma is applied to derive the uniformly most powerful detector and the minimum detectable tumour fraction in closed form. The sequencing assay is modelled as a binary symmetric channel and Shannon channel capacity is calculated. Empirical validation uses n=61 data points extracted from five published peer-reviewed analytical validation studies across five independent institutions in the US and EU (2018 - 2025): Yu et al. 2022, Stetson et al. 2018, Frydendahl et al. 2023, Northcott et al. 2024, and Cheng et al. 2025. Results: The minimum detectable tumour fraction is derived in closed form as f_min approximately equal to (z_alpha + z_beta) multiplied by the square root of (epsilon divided by N), where N is sequencing depth, epsilon is the platform error rate, and z_alpha, z_beta are standard normal quantiles at the specified false positive and false negative rates. Shannon channel capacity is C = 1 minus H(epsilon) bits per read, where H(epsilon) is binary entropy. Empirical validation yields 84.3% agreement for single-locus assays. 
Discordance for multi-locus tumour-informed assays (NeXT Personal, duplex WGS) is consistent with the single-locus model scope and identifies the principal theoretical extension required. Conclusions: This framework provides the first formal Neyman-Pearson optimality proof for ctDNA detection, a closed-form detection limit, and a platform-independent efficiency metric for NHS and regulatory standardisation. Keywords: circulating tumour DNA; liquid biopsy; Neyman-Pearson detection; Shannon channel capacity; sequencing depth; limit of detection; minimal residual disease; signal detection theory
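The two closed-form quantities stated in the Results, the minimum detectable tumour fraction f_min ≈ (z_alpha + z_beta) · sqrt(epsilon / N) and the binary symmetric channel capacity C = 1 − H(epsilon), can be evaluated numerically with a short sketch. It assumes that z_alpha and z_beta are upper-tail standard normal quantiles, i.e. the inverse normal CDF at 1 − alpha and 1 − beta; the abstract does not spell out this convention. The function names and the example depth and error rate are illustrative, not values from the preprint.

```python
import math
from statistics import NormalDist


def min_detectable_fraction(depth, error_rate, alpha=0.05, beta=0.05):
    """f_min ~ (z_alpha + z_beta) * sqrt(error_rate / depth).

    Assumption: z_alpha and z_beta are upper-tail quantiles of the
    standard normal at false positive rate alpha and false negative rate beta.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    z_beta = NormalDist().inv_cdf(1.0 - beta)
    return (z_alpha + z_beta) * math.sqrt(error_rate / depth)


def binary_entropy(p):
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bsc_capacity(error_rate):
    """Capacity of a binary symmetric channel, C = 1 - H(error_rate), in bits per read."""
    return 1.0 - binary_entropy(error_rate)


# Illustrative inputs only: 10,000x depth and a 1e-4 per-base error rate.
depth, error_rate = 10_000, 1e-4
print("f_min:", min_detectable_fraction(depth, error_rate))
print("capacity (bits/read):", bsc_capacity(error_rate))
```

Because f_min scales with the inverse square root of N, quadrupling the sequencing depth only halves the detection limit in this model, which is one way to read the abstract's point about diminishing returns from deeper sequencing.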

20

Convection-enhanced delivery of dexamethasone in glioma suppresses myeloid inflammation while avoiding systemic toxicities

Rolfe, N. W.; Dadario, N. B.; Lei, L.; Tang, A. J.; Amini, M.; Teasley, D. E.; Ifediora, N.; Chabot, P. J.; Winans, N. J.; Yoh, N.; Furnari, J.; Kotidis, C.; Stucke, C. H.; Urena, N. M.; Sun, Y.; Brand, A.; Viswanathan, A.; Upadhyayula, P.; Argenziano, M. G.; Sperring, C. P.; Khoury, N.; Humala, N.; Neira, J.; Sims, P. A.; Gill, B. J.; Canoll, P.; Bruce, J. N.

2026-05-22 cancer biology 10.1101/2025.09.24.677899 medRxiv

Top 0.3%

2.9%

Dexamethasone is widely used to control cerebral edema and inflammation in glioblastoma, but its benefits are limited by systemic toxicities and adverse prognostic associations. We evaluated local administration of dexamethasone via convection-enhanced delivery (CED) to maximize intratumoral anti-inflammatory effects by increasing local corticosteroid exposure while minimizing systemic exposure. In two glioma mouse models, continuous intraparenchymal infusion of dexamethasone was well tolerated with no adverse effects. Pharmacokinetic analyses supported preferential intratumoral distribution and reduced systemic exposure with CED compared with systemic dosing. Single-nucleus RNA sequencing (snRNA-seq) and immunohistochemistry showed attenuation of glioma-associated inflammation with downregulation of reactive microglial/macrophage programs and reduced tumor-infiltrating myeloid cells with a morphology consistent with a less activated state. Experiments in human induced pluripotent stem cell (iPSC)-derived microglia confirmed that dexamethasone directly suppresses inflammatory gene expression, indicating a conserved mechanism across species. This inflammatory suppression was recapitulated in both immortalized microglial (HMC3) and macrophage (THP1) cell lines. These findings suggest that localized dexamethasone delivered by CED reprograms the glioma immune microenvironment and achieves control of inflammation without the systemic adverse effects associated with standard systemic dexamethasone therapy. This clinically translatable strategy may improve symptom management and provide a platform for integrating local immunomodulation with future glioblastoma therapies.